Wikipedia-based Compact Hierarchical Semantics with Application to Semantic Relatedness
Authors
Abstract
A proper semantic representation of words and texts underlies many text processing tasks. In this paper, we present a novel representation of semantics based on a hierarchical ontology of natural concepts derived from Wikipedia's articles and category system. Our method, called Compact Hierarchical Explicit Semantic Analysis (CHESA), generates compact hierarchical representations of unrestricted natural language texts. Compared with previous methods for semantic representation, CHESA generates intuitive and comprehensible representations that allow deep semantic reasoning and understanding. CHESA representations are flexible with regard to their level of abstraction and compactness. We present a methodology for computing semantic relatedness using CHESA representations and evaluate CHESA on the task of semantic relatedness assessment of words and texts. Empirical results show that, for compact representations, CHESA is superior to the previous state of the art.
Similar resources
Wikipedia-based Compact Hierarchical Semantics for Natural Language Processing
A correct semantic representation of words and texts underlies many text processing tasks such as text categorization, word sense disambiguation, and semantic relatedness assessment. It has long been recognized that computers require access to common-sense and domain-specific world knowledge in order to process textual data at a deeper level. In this paper, we present a novel representation of ...
Distributional Semantics for Entity Relatedness
Wikipedia provides an enormous amount of background knowledge for reasoning about the semantic relatedness between two entities. In this work, we present an approach based on distributional semantics for computing entity relatedness, and a focused related-entities explorer built on this approach.
Wikipedia-based Distributional Semantics for Entity Relatedness
Wikipedia provides an enormous amount of background knowledge for reasoning about the semantic relatedness between two entities. We propose Wikipedia-based Distributional Semantics for Entity Relatedness (DiSER), which represents the semantics of an entity by its distribution in the high-dimensional concept space derived from Wikipedia. DiSER measures the semantic relatedness between two entities b...
Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles
When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human-level intelligence, knowledge repositories become necessary to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM, and it requires a rich repository of knowledge. A recen...
Reality is not a game! Extracting Semantics from Unconstrained Navigation on Wikipedia
Semantic relatedness between words has been successfully extracted from navigation on Wikipedia pages. However, the navigational data used in those works are sparse and likely biased, since they were collected in the context of games. In this paper, we address this limitation and explore whether semantic relatedness can also be extracted from unconstrained navigation. To this e...